Distributed information-theoretic clustering
Abstract
We study a novel multi-terminal source coding setup motivated by the biclustering problem. Two separate encoders observe two i.i.d. sequences $X^n$ and $Y^n$, respectively. The goal is to find rate-limited encodings $f(x^n)$ and $g(y^n)$ that maximize the mutual information $\mathrm{I}(f(X^n); g(Y^n))/n$. We discuss connections of this problem with hypothesis testing against independence, pattern recognition, and the information bottleneck method. Improving previous cardinality bounds for the inner and outer bounds allows us to thoroughly study the special case of a binary symmetric source and to quantify the gap between the inner and the outer bound in this special case. Furthermore, we investigate a multiple description (MD) extension of the CEO problem with a mutual information constraint. Surprisingly, this MD-CEO problem permits a tight single-letter characterization of the achievable region.
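As a rough illustration of the objective in the binary symmetric special case, the following sketch evaluates a single-letter mutual information for one simple choice of encodings. It assumes, purely for illustration, that $X$ is a uniform bit, $Y$ is $X$ passed through a binary symmetric channel with crossover probability `p`, and each encoder is modeled by a further binary symmetric test channel with crossover `a` or `b`. The function names and this choice of test channels are assumptions made here and are not taken from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def convolve(a: float, b: float) -> float:
    """Binary convolution: crossover of two cascaded binary symmetric channels."""
    return a * (1 - b) + (1 - a) * b


def encoder_rate_bits(a: float) -> float:
    """I(U;X) = 1 - h(a) for a uniform bit X and U = X xor Bernoulli(a)."""
    return 1.0 - binary_entropy(a)


def mutual_information_bits(p: float, a: float, b: float) -> float:
    """I(U;V) for the Markov chain U - X - Y - V (illustrative assumption).

    U = X xor Bernoulli(a), Y = X xor Bernoulli(p), V = Y xor Bernoulli(b),
    so U and V are linked by a binary symmetric channel with crossover
    convolve(convolve(a, p), b).
    """
    return 1.0 - binary_entropy(convolve(convolve(a, p), b))


if __name__ == "__main__":
    p, a, b = 0.1, 0.05, 0.05
    print("rate of encoder 1:", encoder_rate_bits(a))
    print("rate of encoder 2:", encoder_rate_bits(b))
    print("I(U;V):", mutual_information_bits(p, a, b))
```

Lowering `a` or `b` raises both the encoder rate and the resulting mutual information, which is the trade-off the rate-limited formulation captures. This toy calculation does not reproduce the inner or outer bounds studied in the paper.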
Similar resources
Information Theoretic Clustering
Clustering is one of the important topics in pattern recognition. Since only the structure of the data dictates the grouping (unsupervised learning), information theory is an obvious criterion to establish the clustering rule. This paper describes a novel valley seeking clustering algorithm using an information theoretic measure to estimate the cost of partitioning the data set. The information ...
Information Theoretic Hierarchical Clustering
Hierarchical clustering has been extensively used in practice, where clusters can be assigned and analyzed simultaneously, especially when estimating the number of clusters is challenging. However, due to the conventional proximity measures recruited in these algorithms, they are only capable of detecting mass-shape clusters and encounter problems in identifying complex data structures. Here, w...
Demystifying Information-Theoretic Clustering
Greg Ver Steeg, Aram Galstyan, Fei Sha, Simon DeDeo. 1 Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA 90292, USA; 2 University of Southern California, Los Angeles, CA 90089, USA; 3 Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA; 4 School of Informatics and Computing, Indiana University, 901 E 1...
Information Theoretic Pairwise Clustering
In this paper we develop an information-theoretic approach for pairwise clustering. The Laplacian of the pairwise similarity matrix can be used to define a Markov random walk on the data points. This view forms a probabilistic interpretation of spectral clustering methods. We utilize this probabilistic model to define a novel clustering cost function that is based on maximizing the mutual infor...
Nonparametric Information Theoretic Clustering Algorithm
In this paper we propose a novel clustering algorithm based on maximizing the mutual information between data points and clusters. Unlike previous methods, we neither assume the data are given in terms of distributions nor impose any parametric model on the within-cluster distribution. Instead, we utilize a non-parametric estimation of the average cluster entropies and search for a clustering t...
Journal
Journal title: Information and Inference: A Journal of the IMA
Year: 2021
ISSN: 2049-8772, 2049-8764
DOI: https://doi.org/10.1093/imaiai/iaab007